A new pitch modeling approach for Mandarin speech
Abstract
This paper proposes a new approach to modeling syllable pitch contours in Mandarin speech. It takes the mean and the shape of the syllable pitch contour as two basic modeling units and considers several affecting factors that contribute to their variation. The parameters of the two models are estimated automatically by the EM algorithm. Experimental results showed RMSEs of 0.551 ms and 0.614 ms in the reconstructed pitch for the closed and open tests, respectively. All inferred values of the affecting factors agreed well with prior linguistic knowledge. In addition, the prosodic states automatically labeled by the pitch mean model provided useful cues for determining prosodic phrase boundaries that occur at inter-syllable locations without punctuation marks. The approach is therefore a promising one for pitch modeling.
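As a rough illustration of the two modeling units and the error measure named in the abstract, the sketch below splits one syllable's pitch contour into a mean and a shape, rebuilds it, and computes an RMSE. It is only a sketch under stated assumptions: the mean is taken as the arithmetic mean of the frame values and the shape as the residual around that mean, which may differ from the parameterization used in the paper. The affecting factors and the EM estimation are not implemented, and the function names and sample values are hypothetical.

```python
import math


def split_mean_shape(contour):
    """Split a syllable pitch contour into (mean, shape).

    Assumption: mean = arithmetic mean of the frame values,
    shape = per-frame residual around that mean.
    """
    if not contour:
        raise ValueError("contour must contain at least one frame")
    mean = sum(contour) / len(contour)
    shape = [value - mean for value in contour]
    return mean, shape


def reconstruct(mean, shape):
    """Rebuild a contour by adding the mean back onto the shape."""
    return [mean + s for s in shape]


def rmse(reference, estimate):
    """Root-mean-square error between two equal-length contours."""
    if len(reference) != len(estimate):
        raise ValueError("contours must have the same length")
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical pitch values for one syllable, in ms.
observed = [7.0, 7.5, 8.0, 8.5]
mean, shape = split_mean_shape(observed)

# A stand-in for a model prediction: same shape, mean off by 0.5 ms.
predicted = reconstruct(mean + 0.5, shape)
error = rmse(observed, predicted)
```

In this toy case the shape is reproduced exactly, so the whole reconstruction error comes from the offset in the mean. A model along the lines of the abstract would predict both units from the affecting factors instead.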
Similar resources
Incorporating Pitch Features for Tone Modeling in Automatic Recognition of Mandarin Chinese
Tone plays a fundamental role in Mandarin Chinese, as it plays a lexical role in determining the meanings of words in spoken Mandarin. For example, these two sentences R R (I like horses) and R M (I like to scold) differ only in the tone carried by the last syllable. Thus, the inclusion of tone-related information through analysis of pitch data should improve the performance of automatic speech...
Phonetic state tied-mixture tone modeling for large vocabulary continuous Mandarin speech recognition
This paper presents a new approach to tone modeling for continuous Mandarin speech recognition. Mandarin tones provide rich information for speech recognition. In this paper, we treat the tone as an attribute of the final vowel part of a Mandarin syllable. Separate distributions are estimated for cepstral coefficients and pitch features respectively, and the phonetic state tied-mixture techniqu...
Generation of Fundamental Frequency Contours of Mandarin in HMM-based Speech Synthesis using Generation Process Model
The HMM-based speech synthesis system can produce high-quality synthetic speech with flexible modeling of spectral and prosodic parameters. In this approach, short-term spectra, fundamental frequency (F0) and duration are generated by multi-stream HMMs separately. However, the quality of synthetic speech degrades when feature vectors used in training are noisy. Among all noisy features, pitch tr...
On the inter-syllable coarticulation effect of pitch modeling for Mandarin speech
In this paper, a new statistics-based pitch model for Mandarin speech is proposed. The model considers three major factors affecting the syllable pitch contour: lexical tone, prosodic state and the inter-syllable coarticulation effect. The study emphasizes the modeling of the inter-syllable coarticulation effect. Interactive affections of neighboring tones and different inter-syllable c...
Experiments on Chinese Speech Recog Pitch Estimation Using the M
Automatic speech recognition of a tonal and syllabic language such as Chinese Mandarin poses new challenges but also offers new opportunities. We present approaches and experimental results concerning the choice of base units for acoustic modeling, pitch estimation and how to integrate pitch estimates into the modeling framework. The experimental evaluations are carried out both on rather clean...
Publication date: 2003